CNTRLPLANE-2549:test/e2e: migrate refresh-CA test for OTE compatibility#306
CNTRLPLANE-2549:test/e2e: migrate refresh-CA test for OTE compatibility#306wangke19 wants to merge 5 commits intoopenshift:mainfrom
Conversation
Migrates the refresh-CA test to the OTE (Openshift Test Extended)
Ginkgo framework while maintaining dual-compatibility with traditional
Go tests.
Changes:
- Add Ginkgo test context in e2e.go for OTE test discovery:
* refresh-CA - Tests CA regeneration when deleted and recreated
- Extract test logic into shared function with testing.TB interface:
* testRefreshCA() - Verifies serving certs and configmaps update
when CA secret is deleted and recreated
- Add helper functions in e2e.go:
* pollForCABundleInjectionConfigMapWithReturn() - Polls for configmap
* pollForCARecreation() - Polls for CA secret recreation
- Keep test runner in e2e_test.go that calls shared function
- Remove duplicate helper functions from e2e_test.go
Net change: +30 lines (132 added - 102 removed)
Both test frameworks continue to work:
- Standard Go test: go test -run "^TestE2E$/^refresh-CA$"
- OTE: Ginkgo test discovery via service-ca-operator-tests-ext
WalkthroughAdds a consolidated end-to-end CA refresh test ( Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/e2e/e2e.go`:
- Around line 1230-1317: The test captures configmapCopy before verifying the
injected CA data and calls pollForSecretChangeGinkgo for the headless secret
with no keys (which returns immediately); fix by moving the configmapCopy =
configmap.DeepCopy() to after checkConfigMapCABundleInjectionData(adminClient,
testConfigMapName, ns.Name) so the baseline includes injected data, and call
pollForSecretChangeGinkgo(t, adminClient, headlessSecretCopy, v1.TLSCertKey,
v1.TLSPrivateKeyKey) (same keys used for the regular secret) so the headless
secret change is actually validated; locate symbols
pollForServiceServingSecretWithReturn, configmapCopy,
checkConfigMapCABundleInjectionData, pollForConfigMapChange, headlessSecretCopy,
and pollForSecretChangeGinkgo to make the edits.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (2)
test/e2e/e2e.gotest/e2e/e2e_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
test/e2e/e2e.gotest/e2e/e2e_test.go
🧬 Code graph analysis (1)
test/e2e/e2e.go (1)
pkg/controller/api/api.go (1)
InjectionDataKey(29-29)
🔇 Additional comments (3)
test/e2e/e2e_test.go (1)
697-703: Clean delegation to shared refresh-CA test.Nice consolidation to the shared
testRefreshCApath for OTE compatibility.test/e2e/e2e.go (2)
106-111: refresh-CA Ginkgo context integration looks good.Clear test discovery wiring for OTE.
1319-1349: Polling helpers are straightforward and consistent.Timeout usage and error handling align with the other poll helpers.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
|
@wangke19: This pull request references CNTRLPLANE-2549 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test e2e-aws-operator-serial-ote |
2 similar comments
|
/test e2e-aws-operator-serial-ote |
|
/test e2e-aws-operator-serial-ote |
Move headless-stateful-serving-cert-secret-delete-data context to correct position. It should run after serving-cert-secret-delete-data and before ca-bundle tests. This fixes the test execution sequence to match e2e_test.go order: 1. serving-cert-secret-delete-data 2. headless-stateful-serving-cert-secret-delete-data 3. ca-bundle-injection-configmap 4. ca-bundle-injection-configmap-update 5. vulnerable-legacy-ca-bundle-injection-configmap 6. metrics 7. refresh-CA
Add g.Ordered decorator to Ginkgo Describe block to guarantee tests run in declaration order. Without g.Ordered, Ginkgo randomizes test execution order by default, even with Parallelism: 1. This causes OTE tests to fail because the tests have state dependencies and must run in the exact order defined in e2e_test.go. The g.Ordered decorator ensures tests execute in this sequence: 1. serving-cert-annotation 2. serving-cert-secret-modify-bad-tlsCert 3. serving-cert-secret-add-data 4. serving-cert-secret-delete-data 5. headless-stateful-serving-cert-secret-delete-data 6. ca-bundle-injection-configmap 7. ca-bundle-injection-configmap-update 8. vulnerable-legacy-ca-bundle-injection-configmap 9. metrics 10. refresh-CA This matches the execution order in traditional Go tests.
|
@wangke19: This pull request references CNTRLPLANE-2549 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
The CI job e2e-aws-operator-serial-ote fails because the refresh-CA test deletes the cluster service CA signing key, causing cluster-wide TLS disruption that OTE monitors detect as failures. Split the single suite into a Stable suite (strict monitoring) and a Disruptive suite (relaxed thresholds) so monitors adjust expectations appropriately. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Associataed PR openshift/release#74807 |
|
@wangke19: This pull request references CNTRLPLANE-2549 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@wangke19: This pull request references CNTRLPLANE-2549 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@wangke19: This pull request references CNTRLPLANE-2549 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
wangke19
left a comment
There was a problem hiding this comment.
Thanks for the review! You're correct that g.Ordered is overridden in CI by OTE's suite configuration. However, I'd like to keep it for local development workflows:
- Local
go testruns: When developers rungo test ./test/e2e/...locally (without OTE),g.Orderedensures tests execute in the correct order - Explicit documentation: It makes the test dependencies clear to anyone reading the code
- No harm in CI: Since OTE overrides it, keeping it doesn't cause any issues in CI
The tests have intentional state dependencies (CA creation, rotation, deletion), so having g.Ordered as a safety net for local testing seems valuable. What do you think?
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: gangwgr, wangke19 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
| extension.AddSuite(oteextension.Suite{ | ||
| Name: "openshift/service-ca-operator/operator/serial-disruptive", | ||
| Parallelism: 1, | ||
| ClusterStability: oteextension.ClusterStabilityDisruptive, |
There was a problem hiding this comment.
This is interesting. Do we know which monitor test failed exactly?
There was a problem hiding this comment.
| Parallelism: 1, | ||
| Name: "openshift/service-ca-operator/operator/serial", | ||
| Parallelism: 1, | ||
| ClusterStability: oteextension.ClusterStabilityStable, |
There was a problem hiding this comment.
What is the default ClusterStability? Could we use it ?
There was a problem hiding this comment.
Good point.
ClusterStabilityStable is defined in the vendored openshift-tests-extension package at pkg/extension/types.go line 54:
ClusterStabilityStable ClusterStability = "Stable"
Yes, the default ClusterStability is Stable — when it's empty/unset, openshift-tests defaults to Stable (ref). So we can remove the explicit ClusterStabilityStable for the non-disruptive suite and only set ClusterStabilityDisruptive for the disruptive one. Will update.
@wangke19 will |
You're right — it won't. There's no Ginkgo bootstrap ( Since |
- Remove g.Ordered decorator since OTE overrides it in CI and go test does not run Ginkgo specs - Remove explicit ClusterStabilityStable as it is the default when ClusterStability is unset
|
New changes are detected. LGTM label has been removed. |
|
@wangke19: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
Split the OTE suite into stable and disruptive suites to fix CI failures caused by the
refresh-CAtest's cluster-wide TLS disruption.Problem
The CI job
e2e-aws-operator-serial-otefails because therefresh-CAtest deletes the cluster's service CA signing key (signing-keysecret inopenshift-service-ca), causing cluster-wide TLS disruption. The OTE framework runsopenshift-testsmonitors alongside the tests that detect this disruption and report failures. The current suite does not setClusterStability, which defaults toStable(zero disruption expected).Solution
Split into two suites:
openshift/service-ca-operator/operator/serial): 13 non-disruptive tests with strict cluster health monitoring (ClusterStability: Stable)openshift/service-ca-operator/operator/serial-disruptive): 1 test (refresh-CA) with relaxed monitor thresholds (ClusterStability: Disruptive)Changes
1. Tag refresh-CA test as [Disruptive] in
test/e2e/e2e.go2. Split into two suites in
cmd/service-ca-operator-tests-ext/main.goReplace the single suite with two:
Files Modified
cmd/service-ca-operator-tests-ext/main.gotest/e2e/e2e.go[Disruptive]tag to refresh-CA test nameVerification Results
Build
make build- Bothservice-ca-operatorandservice-ca-operator-tests-extbinaries compiled successfullyOTE Suite Verification
list suitesconfirmed two suites with correctClusterStabilitysettingslist testsconfirmed 14 total tests; refresh-CA tagged[Disruptive]routed to disruptive suiteSuite breakdown:
openshift/service-ca-operator/operator/serial): 13 non-disruptive tests,ClusterStability: Stableopenshift/service-ca-operator/operator/serial-disruptive): 1 test (refresh-CA),ClusterStability: DisruptiveTest Execution
Related Work
CI Configuration
A companion PR has been created in openshift/release to add the new CI job for the disruptive suite:
e2e-aws-operator-serial-disruptive-oteFuture Work
Future CA rotation test migrations (
time-based-ca-rotation,forced-ca-rotation) should also use the[Disruptive]tag so they are automatically routed to the disruptive suite.Commit
ote-migrate-refresh-cae60fe227dtest/e2e: split OTE suite into stable and disruptive for refresh-CAorigin/ote-migrate-refresh-ca